AITopics | meta reinforcement learning

Collaborating Authors

meta reinforcement learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach

Neural Information Processing SystemsDec-24-2025, 06:13:14 GMT

In meta reinforcement learning (meta RL), an agent learns from a set of training tasks how to quickly solve a new task, drawn from the same task distribution. The optimal meta RL policy, a.k.a.~the Bayes-optimal behavior, is well defined, and guarantees optimal reward in expectation, taken with respect to the task distribution. The question we explore in this work is how many training tasks are required to guarantee approximately optimal behavior with high probability. Recent work provided the first such PAC analysis for a model-free setting, where a history-dependent policy was learned from the training tasks. In this work, we propose a different approach: directly learn the task distribution, using density estimation techniques, and then train a policy on the learned task distribution. We show that our approach leads to bounds that depend on the dimension of the task distribution. In particular, in settings where the task distribution lies in a low-dimensional manifold, we extend our analysis to use dimensionality reduction techniques and account for such structure, obtaining significantly better bounds than previous work, which strictly depend on the number of states and actions. The key of our approach is the regularization implied by the kernel density estimation method. We further demonstrate that this regularization is useful in practice, when `plugged in' the state-of-the-art VariBAD meta RL algorithm.

finite training task, meta reinforcement learning, task distribution, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)

Add feedback

Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach

Neural Information Processing SystemsOct-11-2024, 04:08:09 GMT

density estimation approach, meta reinforcement learning, task distribution, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.77)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

Theoretical Analysis of Meta Reinforcement Learning: Generalization Bounds and Convergence Guarantees

Wang, Cangqing, Sui, Mingxiu, Sun, Dan, Zhang, Zecheng, Zhou, Yan

arXiv.org Artificial IntelligenceMay-21-2024

This research delves deeply into Meta Reinforcement Learning (Meta RL) through a exploration focusing on defining generalization limits and ensuring convergence. By employing a approach this article introduces an innovative theoretical framework to meticulously assess the effectiveness and performance of Meta RL algorithms. We present an explanation of generalization limits measuring how well these algorithms can adapt to learning tasks while maintaining consistent results. Our analysis delves into the factors that impact the adaptability of Meta RL revealing the relationship, between algorithm design and task complexity. Additionally we establish convergence assurances by proving conditions under which Meta RL strategies are guaranteed to converge towards solutions. We examine the convergence behaviors of Meta RL algorithms across scenarios providing a comprehensive understanding of the driving forces behind their long term performance. This exploration covers both convergence and real time efficiency offering a perspective, on the capabilities of these algorithms.

algorithm, generalization, meta-rl algorithm, (12 more...)

arXiv.org Artificial Intelligence

2405.1329

Country:

North America > United States > New York > Kings County > New York City (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Evolving Reservoirs for Meta Reinforcement Learning

Léger, Corentin, Hamon, Gautier, Nisioti, Eleni, Hinaut, Xavier, Moulin-Frier, Clément

arXiv.org Artificial IntelligenceJan-29-2024

Animals often demonstrate a remarkable ability to adapt to their environments during their lifetime. They do so partly due to the evolution of morphological and neural structures. These structures capture features of environments shared between generations to bias and speed up lifetime learning. In this work, we propose a computational model for studying a mechanism that can enable such a process. We adopt a computational framework based on meta reinforcement learning as a model of the interplay between evolution and development. At the evolutionary scale, we evolve reservoirs, a family of recurrent neural networks that differ from conventional networks in that one optimizes not the synaptic weights, but hyperparameters controlling macro-level properties of the resulting network architecture. At the developmental scale, we employ these evolved reservoirs to facilitate the learning of a behavioral policy through Reinforcement Learning (RL). Within an RL agent, a reservoir encodes the environment state before providing it to an action policy. We evaluate our approach on several 2D and 3D simulated environments. Our results show that the evolution of reservoirs can improve the learning of diverse challenging tasks. We study in particular three hypotheses: the use of an architecture combining reservoirs and reinforcement learning could enable (1) solving tasks with partial observability, (2) generating oscillatory dynamics that facilitate the learning of locomotion tasks, and (3) facilitating the generalization of learned behaviors to new tasks unknown during the evolution phase.

agent, experiment, reservoir, (15 more...)

arXiv.org Artificial Intelligence

2312.06695

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Improved Robustness and Safety for Pre-Adaptation of Meta Reinforcement Learning with Prior Regularization

Wen, Lu, Zhang, Songan, Tseng, H. Eric, Singh, Baljeet, Filev, Dimitar, Peng, Huei

arXiv.org Artificial IntelligenceFeb-9-2023

Meta Reinforcement Learning (Meta-RL) has seen substantial advancements recently. In particular, off-policy methods were developed to improve the data efficiency of Meta-RL techniques. \textit{Probabilistic embeddings for actor-critic RL} (PEARL) is a leading approach for multi-MDP adaptation problems. A major drawback of many existing Meta-RL methods, including PEARL, is that they do not explicitly consider the safety of the prior policy when it is exposed to a new task for the first time. Safety is essential for many real-world applications, including field robots and Autonomous Vehicles (AVs). In this paper, we develop the PEARL PLUS (PEARL$^+$) algorithm, which optimizes the policy for both prior (pre-adaptation) safety and posterior (after-adaptation) performance. Building on top of PEARL, our proposed PEARL$^+$ algorithm introduces a prior regularization term in the reward function and a new Q-network for recovering the state-action value under prior context assumptions, to improve the robustness to task distribution shift and safety of the trained network exposed to a new task for the first time. The performance of PEARL$^+$ is validated by solving three safety-critical problems related to robots and AVs, including two MuJoCo benchmark problems. From the simulation experiments, we show that safety of the prior policy is significantly improved and more robust to task distribution shift compared to PEARL.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IROS47612.2022.9981621

2108.08448

Country: North America > United States > Michigan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

BIMRL: Brain Inspired Meta Reinforcement Learning

Rohani, Seyed Roozbeh Razavi, Hedayatian, Saeed, Baghshah, Mahdieh Soleymani

arXiv.org Artificial IntelligenceOct-29-2022

Sample efficiency has been a key issue in reinforcement learning (RL). An efficient agent must be able to leverage its prior experiences to quickly adapt to similar, but new tasks and situations. Meta-RL is one attempt at formalizing and addressing this issue. Inspired by recent progress in meta-RL, we introduce BIMRL, a novel multi-layer architecture along with a novel brain-inspired memory module that will help agents quickly adapt to new tasks within a few episodes. We also utilize this memory module to design a novel intrinsic reward that will guide the agent's exploration. Our architecture is inspired by findings in cognitive neuroscience and is compatible with the knowledge on connectivity and functionality of different regions in the brain. We empirically validate the effectiveness of our proposed method by competing with or surpassing the performance of some strong baselines on multiple MiniGrid environments.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IROS47612.2022.9981250

2210.1653

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Meta Reinforcement Learning for Optimal Design of Legged Robots

Belmonte-Baeza, Álvaro, Lee, Joonho, Valsecchi, Giorgio, Hutter, Marco

arXiv.org Artificial IntelligenceOct-6-2022

The process of robot design is a complex task and the majority of design decisions are still based on human intuition or tedious manual tuning. A more informed way of facing this task is computational design methods where design parameters are concurrently optimized with corresponding controllers. Existing approaches, however, are strongly influenced by predefined control rules or motion templates and cannot provide end-to-end solutions. In this paper, we present a design optimization framework using model-free meta reinforcement learning, and its application to the optimizing kinematics and actuator parameters of quadrupedal robots. We use meta reinforcement learning to train a locomotion policy that can quickly adapt to different designs. This policy is used to evaluate each design instance during the design optimization. We demonstrate that the policy can control robots of different designs to track random velocity commands over various rough terrains. With controlled experiments, we show that the meta policy achieves close-to-optimal performance for each design instance after adaptation. Lastly, we compare our results against a model-based baseline and show that our approach allows higher performance while not being constrained by predefined motions or gait patterns.

machine learning, optimization, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2022.3211785

2210.0275

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.84)

Add feedback

On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies

Wang, Haozhi, Wang, Qing, Shao, Yunfeng, Li, Dong, Hao, Jianye, Li, Yinchuan

arXiv.org Artificial IntelligenceSep-20-2022

Modern meta-reinforcement learning (Meta-RL) methods are mainly developed based on model-agnostic meta-learning, which performs policy gradient steps across tasks to maximize policy performance. However, the gradient conflict problem is still poorly understood in Meta-RL, which may lead to performance degradation when encountering distinct tasks. To tackle this challenge, this paper proposes a novel personalized Meta-RL (pMeta-RL) algorithm, which aggregates task-specific personalized policies to update a meta-policy used for all tasks, while maintaining personalized policies to maximize the average return of each task under the constraint of the meta-policy. We also provide the theoretical analysis under the tabular setting, which demonstrates the convergence of our pMeta-RL algorithm. Moreover, we extend the proposed pMeta-RL algorithm to a deep network version based on soft actor-critic, making it suitable for continuous control tasks. Experiment results show that the proposed algorithms outperform other previous Meta-RL algorithms on Gym and MuJoCo suites.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2209.10072

Country:

Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Adaptive Adversarial Training for Meta Reinforcement Learning

Chen, Shiqi, Chen, Zhengyu, Wang, Donglin

arXiv.org Artificial IntelligenceApr-27-2021

Meta Reinforcement Learning (MRL) enables an agent to learn from a limited number of past trajectories and extrapolate to a new task. In this paper, we attempt to improve the robustness of MRL. We build upon model-agnostic meta-learning (MAML) and propose a novel method to generate adversarial samples for MRL by using Generative Adversarial Network (GAN). That allows us to enhance the robustness of MRL to adversal attacks by leveraging these attacks during meta training process.

adversarial attack, arxiv preprint arxiv, reinforcement, (12 more...)

arXiv.org Artificial Intelligence

2104.13302

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.98)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning to Learn More: Meta Reinforcement Learning

#artificialintelligenceOct-29-2020, 06:15:38 GMT

The ELI5 definition for Reinforcement Learning would be training a model to perform better by iteratively learning from its previous mistakes. Reinforcement learning provides a framework for agents to solve problems in case of real-world scenarios. They are able to learn rules (or policies) to solve specific problems, but one of the major limitations of these agents are that they are unable to generalize the learned policy to newer problems. A previously learned rule would cater to a specific problem only, and would often be useless for other (even similar) cases. A good meta-learning model on the other hand, is expected to generalize to new tasks or environments that have not been encountered by the model in training.

artificial intelligence, learning, reinforcement learning, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback